Review for NeurIPS paper: A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs
Correctness: The main technical content seems to be correct. I have the following questions, though: When using the linear assumption for the reward and the dynamics, the feature selection/setting is crucial. To relax the linear assumption, it is also mentioned that features can be pre-trained. What would be the recommended way to pre-train them? If the assumptions are violated, how would that affect the results in practice?
Review for NeurIPS paper: A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs
This is a borderline paper. The paper is technically sound, and addressing OPE in the average-reward setting is an important problem. Although the work is an extension of Duan and Wang (for the discounted setting) to the average-reward setting, the algorithm is somewhat different: Duan and Wang use FQE, whereas the current paper performs stationary-distribution estimation. That said, there are a few weaknesses that the paper should try to address or at least discuss: 1. The entropy maximization is a novel algorithmic element which does not appear in previous approaches in the discounted setting.
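To make the distinction concrete (a sketch in our own notation, not taken from the review or the paper): an FQE-style estimator in the discounted setting fits an action-value function $\hat{Q}$ by iterating approximate Bellman backups and reports $\hat{v}_\pi = \mathbb{E}_{s_0 \sim d_0,\, a_0 \sim \pi}[\hat{Q}(s_0, a_0)]$, whereas a stationary-distribution estimator in the average-reward setting fits $\hat{\mu}_\pi(s,a)$ and reports the average reward $\hat{\rho}_\pi = \sum_{s,a} \hat{\mu}_\pi(s,a)\, r(s,a)$.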
A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs
Nevena Lazic, Dong Yin, Mehrdad Farajtabar, Nir Levine, Dilan Gorur, Chris Harris, Dale Schuurmans
This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs). For MDPs that are ergodic and linear (i.e. where rewards and dynamics are linear in some known features), we provide the first finite-sample OPE error bound, extending existing results beyond the episodic and discounted cases. In a more general setting, when the feature dynamics are approximately linear and for arbitrary rewards, we propose a new approach for estimating stationary distributions with function approximation. We formulate this problem as finding the maximum-entropy distribution subject to matching feature expectations under empirical dynamics. We show that this results in an exponential-family distribution whose sufficient statistics are the features, paralleling maximum-entropy approaches in supervised learning. We demonstrate the effectiveness of the proposed OPE approaches in multiple environments.
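As a sketch of the formulation described in the abstract (the notation here is ours and may differ from the paper's): let $\phi(s,a) \in \mathbb{R}^d$ denote the features, $\hat{P}$ the empirical transition model, and $\pi$ the target policy. The stationary-distribution estimate solves

$\max_{\mu \in \Delta(\mathcal{S} \times \mathcal{A})} H(\mu)$ subject to $\mathbb{E}_{(s,a) \sim \mu}[\phi(s,a)] = \mathbb{E}_{(s,a) \sim \mu,\; s' \sim \hat{P}(\cdot \mid s,a),\; a' \sim \pi(\cdot \mid s')}[\phi(s',a')]$,

i.e. the feature expectations are required to be stationary under the empirical dynamics and the target policy. When the feature dynamics are (approximately) linear, $\mathbb{E}[\phi(s',a') \mid s,a] \approx M\phi(s,a)$, the Lagrangian dual yields an exponential-family solution $\mu_\theta(s,a) \propto \exp(\theta^\top \phi(s,a))$, consistent with the abstract's statement that the features are the sufficient statistics; the average reward of $\pi$ can then be estimated as $\mathbb{E}_{(s,a) \sim \mu_\theta}[r(s,a)]$.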
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.82)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)